Cryptocurrency Analysis

Cryptocurrency:

A cryptocurrency (also written crypto currency, or crypto for short) is a digital asset designed to work as a medium of exchange. Individual coin ownership records are stored in a ledger, a computerized database that uses strong cryptography to secure transaction records, to control the creation of additional coins, and to verify the transfer of coin ownership.

The cryptocurrency market has been volatile from the very beginning, but the last couple of years have been a particularly wild ride for millions of investors around the world. Many have made millions on the big upswings, and yet many have lost large and small investments in the bursting bubbles and sudden market downturns.

Understanding the Problem:

In this assignment we are given consolidated financial information for the top 10 cryptocurrencies by market cap, extracted from CoinMarketCap.com. We want to understand how the prices of these currencies have changed over time, and how the trading volume of the top 3 currencies changed over the period 2016-2019. The visuals are intended for two age groups: 17-35 and 60+.

Gathering Data:

Once the requirements are clearly understood, the data is loaded into a Python Jupyter notebook using the pandas library. Once the data is loaded and viewable, we check its data types to understand what kind of data we are dealing with. From the dataset we extract the following columns:

1) Currency (Object)
2) Date (Datetime)
3) High (Float)
4) Volume (Int)
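As a rough sketch of this extraction step, the snippet below reads only the four columns of interest and parses the dates. The three rows are copied from the `df.head()` output shown later in this notebook; the in-memory `sample_csv`, the names `wanted` and `sample`, and the date format string `%d-%b-%y` are assumptions based on how those rows look, not on the full file.

```python
import io
import pandas as pd

# Three rows in the same layout as the CSV used in this notebook
# (values taken from the df.head() output; the full file may differ).
sample_csv = io.StringIO(
    "Currency,Date,Open,High,Low,Close,Volume,Market Cap\n"
    "tezos,4-Dec-19,1.29,1.32,1.25,1.25,46048752,824588509\n"
    "tezos,3-Dec-19,1.24,1.32,1.21,1.29,41462224,853213342\n"
    "tezos,2-Dec-19,1.25,1.26,1.20,1.24,27574097,817872179\n"
)

# Read only the columns needed for the two visuals.
wanted = ["Currency", "Date", "High", "Volume"]
sample = pd.read_csv(sample_csv, usecols=wanted)

# Assumed format: day, abbreviated month name, two-digit year (e.g. 4-Dec-19).
sample["Date"] = pd.to_datetime(sample["Date"], format="%d-%b-%y")

print(sample.dtypes)
```

Reading with `usecols` keeps the frame small from the start; selecting the columns after loading, as the notebook does, gives the same result.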

Data Wrangling, Validation and Creating Pivot Tables:

We first convert the Date column to a datetime and then reduce it to the year, so that values can be aggregated and plotted per year. We then check for missing data and deal with it accordingly. Once the data has been cleansed, we move on to visualization.

The required visuals need two more dataframes. The first consists of Currency, Date and High; the second consists of Currency, Date and Volume. Each is grouped by Currency and Date, keeping the yearly maximum, and then converted into a pivot table.

For the first pivot table, Date is the index, Currency the columns and High the values. For the second, we first filter the Date to 2016-2019 and the currencies to the top 3, i.e. Bitcoin, Ethereum and Tether, and then pivot with Date as the index, Currency as the columns and Volume as the values.
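The filter-then-pivot step can be sketched on a small hand-made frame. All numbers below are made up for illustration, the names `long_df`, `top3` and `wide` are hypothetical, and `Series.between` is used here as one way to express the 2016-2019 range; the notebook itself uses a `> 2015` comparison.

```python
import pandas as pd

# Made-up yearly volumes in long form (one row per currency and year).
long_df = pd.DataFrame({
    "Currency": ["bitcoin", "bitcoin", "bitcoin", "tether", "tether", "xrp"],
    "Date":     [2015, 2016, 2017, 2016, 2017, 2016],
    "Volume":   [10, 20, 30, 5, 15, 7],
})

top3 = ["bitcoin", "ethereum", "tether"]
in_range = long_df["Date"].between(2016, 2019)  # inclusive on both ends
in_top3 = long_df["Currency"].isin(top3)

# Rows become years, columns become currencies, cells hold the volume.
wide = long_df[in_range & in_top3].pivot(
    index="Date", columns="Currency", values="Volume"
)
print(wide)
```

The 2015 row and the `xrp` row are dropped by the two masks, so only the currencies and years of interest reach the pivot.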

Visualization:

After creating the pivot tables, we create a line plot and a box plot using matplotlib for the first requirement. A line graph displays change over time as a series of data points connected by straight line segments on two axes. It is simple and clearly shows how a value changes over time. A box plot summarizes the distribution of a set of values, so the distributions of several currencies can be displayed and compared side by side. These visuals are particularly useful in financial analysis and can cater to both age groups. The colors used will be sober and visually appealing to both age groups.

For the second requirement we again create a line plot, together with a bar plot, using matplotlib. The reason for the line graph is the same as in the first requirement. A bar chart presents categorical data with rectangular bars whose heights or lengths are proportional to the values they represent. It is another simple representation and shows the change in volume over time.

Importing Required Libraries

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline 

Reading the CSV File

In [2]:
df = pd.read_csv("D:\\Documents\\Assignment\\consolidated_coin_data.csv", delimiter=",")
print('Data read into a pandas dataframe!')
Data read into a pandas dataframe!
In [3]:
df.head() #Calling first 5 rows
Out[3]:
Currency Date Open High Low Close Volume Market Cap
0 tezos 4-Dec-19 1.29 1.32 1.25 1.25 46048752 824588509
1 tezos 3-Dec-19 1.24 1.32 1.21 1.29 41462224 853213342
2 tezos 2-Dec-19 1.25 1.26 1.20 1.24 27574097 817872179
3 tezos 1-Dec-19 1.33 1.34 1.25 1.25 24127567 828296390
4 tezos 30-Nov-19 1.31 1.37 1.31 1.33 28706667 879181680
In [4]:
df.dtypes #Checking Datatypes
Out[4]:
Currency       object
Date           object
Open          float64
High          float64
Low           float64
Close         float64
Volume          int64
Market Cap      int64
dtype: object
In [5]:
df['Date'] = pd.to_datetime(df['Date'])
df['Date'] = df['Date'].dt.year  # Keep only the year so values can be grouped per year
In [6]:
df.dtypes #Checking updated column
Out[6]:
Currency       object
Date            int64
Open          float64
High          float64
Low           float64
Close         float64
Volume          int64
Market Cap      int64
dtype: object
In [7]:
df.describe() #Understanding the data
Out[7]:
Date Open High Low Close Volume Market Cap
count 28944.000000 28944.000000 28944.000000 28944.000000 28944.000000 2.894400e+04 2.894400e+04
mean 2016.111940 300.719748 309.832808 290.858372 300.947362 8.133058e+08 7.194826e+09
std 1.920268 1373.884718 1416.598612 1325.072673 1374.461259 3.059516e+09 2.469322e+10
min 2013.000000 0.000000 0.000000 0.000000 0.000000 0.000000e+00 0.000000e+00
25% 2014.000000 0.210000 0.210000 0.200000 0.210000 2.418700e+05 6.345143e+07
50% 2016.000000 2.995000 3.090000 2.880000 2.980000 5.212684e+06 3.453673e+08
75% 2018.000000 24.430000 25.530000 23.270000 24.430000 1.554764e+08 3.422403e+09
max 2019.000000 19475.800000 20089.000000 18974.100000 19497.400000 5.350913e+10 3.265025e+11
In [8]:
df.info(verbose=False) #Information about the dataframe
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 28944 entries, 0 to 28943
Columns: 8 entries, Currency to Market Cap
dtypes: float64(4), int64(3), object(1)
memory usage: 1.8+ MB

Checking for Missing Data

In [9]:
missing_data = df.isnull()
missing_data.head(5)
Out[9]:
Currency Date Open High Low Close Volume Market Cap
0 False False False False False False False False
1 False False False False False False False False
2 False False False False False False False False
3 False False False False False False False False
4 False False False False False False False False
In [10]:
df.isnull().sum()
Out[10]:
Currency      0
Date          0
Open          0
High          0
Low           0
Close         0
Volume        0
Market Cap    0
dtype: int64
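`isnull()` only detects NaN values. The `describe()` output earlier in this notebook reports a minimum of 0 for the price, volume and market cap columns, so zero entries may be standing in for missing values; that is a possibility worth checking, not a confirmed fact about this dataset. A minimal sketch of such a check, on made-up rows with the hypothetical names `toy` and `zero_counts`:

```python
import pandas as pd

# Made-up rows: two zero prices and one zero volume.
toy = pd.DataFrame({
    "Currency": ["a", "a", "b", "b"],
    "High": [1.32, 0.0, 20089.0, 0.0],
    "Volume": [46048752, 0, 41462224, 27574097],
})

# Count exact zeros in every numeric column.
numeric_cols = toy.select_dtypes(include="number").columns
zero_counts = (toy[numeric_cols] == 0).sum()
print(zero_counts)
```

Applied to `df`, the same two lines would show how many rows carry a zero in each column before deciding whether to keep or drop them.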
In [11]:
df.corr(numeric_only=True) # Correlation between the numeric columns (Currency is text and is skipped)
Out[11]:
Date Open High Low Close Volume Market Cap
Date 1.000000 0.184136 0.183393 0.185257 0.184023 0.345563 0.268524
Open 0.184136 1.000000 0.999268 0.998868 0.998551 0.560011 0.953660
High 0.183393 0.999268 1.000000 0.998588 0.999403 0.561062 0.954377
Low 0.185257 0.998868 0.998588 1.000000 0.999205 0.559677 0.954393
Close 0.184023 0.998551 0.999403 0.999205 1.000000 0.560457 0.955012
Volume 0.345563 0.560011 0.561062 0.559677 0.560457 1.000000 0.591818
Market Cap 0.268524 0.953660 0.954377 0.954393 0.955012 0.591818 1.000000

To view the dimensions of the dataframe, we use the .shape attribute.

In [12]:
df.shape # size of dataframe (rows, columns)  
Out[12]:
(28944, 8)
In [13]:
df1 = df[['Currency','Date','High']] #creating new dataframe
In [14]:
df1 = df1.groupby(['Currency','Date'],as_index=False).max() # Yearly maximum High per currency, in long form ready for pivoting
df1 
Out[14]:
Currency Date High
0 binance-coin 2013 53.15
1 binance-coin 2014 32.06
2 binance-coin 2015 8.73
3 binance-coin 2016 5.95
4 binance-coin 2017 53.55
... ... ... ...
79 xrp 2015 0.02
80 xrp 2016 0.01
81 xrp 2017 2.85
82 xrp 2018 3.84
83 xrp 2019 0.51

84 rows × 3 columns

In [15]:
grouped_pivot = df1.pivot(index='Date',columns='Currency',values='High') #Creating Pivot Table
grouped_pivot
Out[15]:
Currency binance-coin bitcoin bitcoin-cash bitcoin-sv cardano eos ethereum litecoin stellar tether tezos xrp
Date
2013 53.15 1156.14 147.49 53.15 53.15 53.15 1156.14 53.15 53.15 147.49 53.15 147.49
2014 32.06 1017.12 0.03 32.06 32.06 32.06 1017.12 32.06 32.06 0.03 32.06 0.03
2015 8.73 495.56 1.22 8.73 0.01 8.73 320.43 8.73 0.01 1.22 0.01 0.02
2016 5.95 979.40 1.01 5.95 0.00 5.95 21.52 5.95 0.00 1.01 0.00 0.01
2017 53.55 20089.00 4355.62 53.55 0.78 53.55 881.94 375.29 0.39 1.21 12.19 2.85
2018 24.91 17712.40 3071.16 243.79 1.33 22.89 1432.88 323.11 0.94 1.07 7.55 3.84
2019 39.57 13796.49 522.09 255.88 0.11 8.59 361.40 146.43 0.16 1.06 1.83 0.51
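The `groupby(...).max()` followed by `pivot` can also be written as a single `pivot_table` call with `aggfunc="max"`. The sketch below compares both routes on made-up rows, under the assumption that the yearly maximum is the intended aggregate; `daily`, `two_step` and `one_step` are illustrative names.

```python
import pandas as pd

# Made-up rows: several observations per currency and year.
daily = pd.DataFrame({
    "Currency": ["bitcoin", "bitcoin", "bitcoin", "xrp", "xrp", "xrp"],
    "Date":     [2018, 2018, 2019, 2018, 2019, 2019],
    "High":     [100.0, 250.0, 175.0, 0.5, 0.3, 0.4],
})

# Route used in this notebook: aggregate first, then reshape.
two_step = (
    daily.groupby(["Currency", "Date"], as_index=False)["High"].max()
         .pivot(index="Date", columns="Currency", values="High")
)

# Equivalent single call: reshape and aggregate together.
one_step = daily.pivot_table(
    index="Date", columns="Currency", values="High", aggfunc="max"
)
print(one_step)
```

Plain `pivot` raises an error when an index/column pair occurs more than once, which is why the notebook aggregates before pivoting; `pivot_table` handles the duplicates through `aggfunc`.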
In [16]:
plt.style.use(['bmh']) # bmh styling used for visualization
In [17]:
grouped_pivot.plot(kind='line',figsize=(10,5))

plt.title('Change in Price of Currencies',fontsize = 20)
plt.ylabel('Price',fontsize = 15)
plt.xlabel('Years',fontsize = 15)
plt.legend(fontsize = 10)
plt.show()
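Bitcoin's yearly peak is several orders of magnitude above most other coins in the pivot table, so the cheaper coins are hard to tell apart on a linear axis. One option is a logarithmic y-axis via the `logy` argument of `DataFrame.plot`. The sketch below uses three columns copied from the pivot table output for 2017 to 2019; the name `prices` is illustrative, and the `Agg` backend line is only there so the snippet runs as a standalone script (it can be dropped inside a notebook, where `plt.show()` would follow).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Yearly peak prices for three currencies, taken from the pivot table output.
prices = pd.DataFrame(
    {
        "bitcoin": [20089.00, 17712.40, 13796.49],
        "tether": [1.21, 1.07, 1.06],
        "xrp": [2.85, 3.84, 0.51],
    },
    index=pd.Index([2017, 2018, 2019], name="Date"),
)

# logy=True puts the y-axis on a log scale so all three lines stay visible.
ax = prices.plot(kind="line", logy=True, figsize=(10, 5))
ax.set_title("Change in Price of Currencies (log scale)", fontsize=20)
ax.set_ylabel("Price", fontsize=15)
ax.set_xlabel("Years", fontsize=15)
ax.set_xticks(list(prices.index))  # one tick per year instead of fractional years
```

Whether a log axis suits both target age groups is a judgment call; it trades familiarity for the ability to see every currency on one chart.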
In [18]:
grouped_pivot.plot(kind='box' ,figsize=(14, 5))

plt.title('Change in Price of Currencies Over the Course of Time')
plt.ylabel('Price')
plt.xlabel('Currency') # each box summarizes one currency across all years
plt.yticks(np.arange(0, 21000, 2500))
plt.show() # need this line to show the updates made to the figure
In [19]:
df2 = df[['Currency','Date','Volume']]
df2 = df2.groupby(['Currency','Date'],as_index=False).max()
In [20]:
df3 = df2[df2['Date'] > 2015]
df3.head()
Out[20]:
Currency Date Volume
3 binance-coin 2016 19773600
4 binance-coin 2017 1730780032
5 binance-coin 2018 637020992
6 binance-coin 2019 742382920
10 bitcoin 2016 363320992
In [21]:
df3_pivot = df3.pivot(index='Date',columns='Currency',values = 'Volume')
df3_pivot
Out[21]:
Currency binance-coin bitcoin bitcoin-cash bitcoin-sv cardano eos ethereum litecoin stellar tether tezos xrp
Date
2016 19773600 363320992 7399410 19773600 2369690 19773600 199408000 19773600 2369690 7399410 2369690 15967600
2017 1730780032 22197999616 11889600512 1730780032 645155968 1730780032 5179829760 6961679872 537916032 4687949824 280844992 8108389888
2018 637020992 23840899072 5377260032 637020992 1713769984 4870720000 9214950400 3481550080 1513270016 6967777735 15960800 9110439936
2019 742382920 45105733173 4522945333 1701098408 340512939 5394932035 18661465873 6442000276 837742777 53509128965 123832790 9415068271
In [22]:
df4 = df3[df3['Currency'].isin(['bitcoin', 'ethereum', 'tether'])] # keep only the top 3 currencies
df4_pivot = df4.pivot(index='Date',columns='Currency',values = 'Volume')
df4_pivot
Out[22]:
Currency bitcoin ethereum tether
Date
2016 363320992 199408000 7399410
2017 22197999616 5179829760 4687949824
2018 23840899072 9214950400 6967777735
2019 45105733173 18661465873 53509128965
In [23]:
df4_pivot.plot(kind='bar',figsize=(10, 5))
plt.title('Change in Volume of Top 3 Currencies from 2016-2019', fontsize = 15)
plt.ylabel('Volume')
plt.xlabel('Years')
plt.legend(fontsize = 10)
plt.show() 
In [24]:
df4_pivot.plot(kind='line',figsize=(10, 5)) # Line chart of the same volume data
plt.title('Change in Volume of Top 3 Currencies from 2016-2019', fontsize = 15)
plt.ylabel('Volume')
plt.xlabel('Years')
plt.xticks(np.arange(2016, 2020, 1))
plt.legend(fontsize = 10)
plt.show()
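Volumes in this table reach the tens of billions, so matplotlib labels the axis with a scientific-notation offset that can be hard to read. One way to keep the axis readable is to rescale to billions before plotting. The figures below are copied from the `df4_pivot` output above; `volume_bn` is an illustrative name.

```python
import pandas as pd

# Yearly peak volumes for the top 3 currencies, as shown in the df4_pivot output.
df4_pivot = pd.DataFrame(
    {
        "bitcoin": [363320992, 22197999616, 23840899072, 45105733173],
        "ethereum": [199408000, 5179829760, 9214950400, 18661465873],
        "tether": [7399410, 4687949824, 6967777735, 53509128965],
    },
    index=pd.Index([2016, 2017, 2018, 2019], name="Date"),
)

# Express volumes in billions, rounded to two decimals.
volume_bn = (df4_pivot / 1e9).round(2)
print(volume_bn)
```

Plotting `volume_bn` with the y-axis labelled `'Volume (billions)'` would then give the same chart shape with plain tick labels.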